LLTools: Machine Learning for Human Language Processing

Authors

  • C. K. Dagli
  • W. M. Campbell
  • L. Li
  • J. Williams
  • K. Geyer
  • G. Vidaver
  • J. Acevedo-Aviles
  • E. Wolf
  • J. Taylor
  • J. P. Campbell
Abstract

Machine learning methods in Human Language Technology have reached a stage of maturity where widespread use is both possible and desirable. The MIT Lincoln Laboratory LLTools software suite takes a step towards this goal by offering a set of easily accessible frameworks for incorporating speech, text, and entity resolution components into larger applications. For the speech processing component, the pySLGR (Speaker, Language, Gender Recognition) tool provides signal processing, standard feature analysis, speech utterance embedding, and machine learning modeling methods in Python. The text processing component in LLTools extracts semantically meaningful insights from unstructured data via entity extraction, topic modeling, and document classification. The entity resolution component in LLTools provides approximate string matching, author recognition, and graph-based methods for identifying and linking different instances of the same real-world entity. We show through two applications that LLTools can be used to rapidly create and train research prototypes for human language processing.
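
The abstract mentions approximate string matching as one ingredient of entity resolution. The following is a minimal, illustrative sketch of that general idea using only Python's standard library. It is not the LLTools API: the function names `similarity` and `link_mentions`, the greedy grouping strategy, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] after case-folding and
    collapsing runs of whitespace (hypothetical helper, not LLTools)."""
    norm_a = " ".join(a.casefold().split())
    norm_b = " ".join(b.casefold().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def link_mentions(mentions, threshold=0.85):
    """Greedily group mentions: each mention joins the first group whose
    first member is at least `threshold` similar, else starts a new group.
    The threshold value is an assumption, not a recommended setting."""
    groups = []
    for mention in mentions:
        for group in groups:
            if similarity(mention, group[0]) >= threshold:
                group.append(mention)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([mention])
    return groups


if __name__ == "__main__":
    for group in link_mentions(["Jon Smith", "John Smith", "Acme Corp"]):
        print(group)
```

A real system would likely combine such string scores with the author recognition and graph-based evidence the abstract describes; this sketch covers only the string-matching step.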

Publication date: 2016